AITopics | combinatorial perspective

A Combinatorial Perspective on the Optimization of Shallow ReLU Networks

Neural Information Processing Systems, Dec-24-2025, 18:11:12 GMT

The NP-hard problem of optimizing a shallow ReLU network can be characterized as a combinatorial search over each training example's activation pattern followed by a constrained convex problem given a fixed set of activation patterns. We explore the implications of this combinatorial aspect of ReLU optimization in this work. We show that it can be naturally modeled via a geometric and combinatoric object known as a zonotope with its vertex set isomorphic to the set of feasible activation patterns. This assists in analysis and provides a foundation for further research. We demonstrate its usefulness when we explore the sensitivity of the optimal loss to perturbations of the training data. Later we discuss methods of zonotope vertex selection and its relevance to optimization. Overparameterization assists in training by making a randomly chosen vertex more likely to contain a good solution. We then introduce a novel polynomial-time vertex selection procedure that provably picks a vertex containing the global optimum using only double the minimum number of parameters required to fit the data. We further introduce a local greedy search heuristic over zonotope vertices and demonstrate that it outperforms gradient descent on underparameterized problems.
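The combinatorial step described in the abstract can be illustrated on a toy case. The sketch below is not the authors' procedure: it assumes a single bias-free ReLU unit on 2-D inputs, and the function name `feasible_activation_patterns`, the angle sweep, and the three data points are all hypothetical choices made for illustration. It only enumerates which on/off patterns are reachable; the constrained convex fit for a fixed pattern is omitted.

```python
import numpy as np

def feasible_activation_patterns(X, num_angles=360):
    """Collect the on/off patterns a bias-free ReLU unit can realise on 2-D data X.

    Assumption: for 2-D inputs and no bias, only the direction of the weight
    vector matters, so sweeping unit vectors around the circle is enough.
    """
    # Offset by half a step so that, for the toy data below, no sampled
    # direction lies exactly on a boundary where some x_i . w == 0.
    angles = (np.arange(num_angles) + 0.5) * (2 * np.pi / num_angles)
    W = np.stack([np.cos(angles), np.sin(angles)], axis=1)  # shape (num_angles, 2)
    active = (X @ W.T) > 0                                   # shape (n, num_angles)
    # Each column is the activation pattern induced by one weight direction.
    return {tuple(int(a) for a in col) for col in active.T}

# Three hypothetical training inputs.
X = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
patterns = feasible_activation_patterns(X)
print(sorted(patterns))
print(len(patterns), "feasible patterns out of", 2 ** len(X), "binary strings")
```

For these three points only 6 of the 8 binary strings are reachable: no weight direction activates the first two examples while leaving the third off, or the reverse. This is the sense in which the search runs over feasible patterns rather than over all of them.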

combinatorial perspective, name change, optimization, (6 more...)

Neural Information Processing Systems

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.60)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.60)
Information Technology > Artificial Intelligence > Machine Learning > Computational Learning Theory (0.60)

A Combinatorial Perspective on Transfer Learning

Neural Information Processing Systems, Dec-23-2025, 17:52:32 GMT

Human intelligence is characterized not only by the capacity to learn complex skills, but also by the ability to rapidly adapt and acquire new skills within an ever-changing environment. In this work we study how the learning of modular solutions can allow for effective generalization to both unseen and potentially differently distributed data. Our main postulate is that the combination of task segmentation, modular learning and memory-based ensembling can give rise to generalization on an exponentially growing number of unseen tasks. We provide a concrete instantiation of this idea using a combination of: (1) the Forget-Me-Not Process, for task segmentation and memory-based ensembling; and (2) Gated Linear Networks, which in contrast to contemporary deep learning techniques use a modular and local learning mechanism. We demonstrate that this system exhibits a number of desirable continual learning properties: robustness to catastrophic forgetting, no negative transfer and increasing levels of positive transfer as more tasks are seen. We show competitive performance against both offline and online methods on standard continual learning benchmarks.

combinatorial perspective, name change, transfer learning, (5 more...)

Neural Information Processing Systems

Industry: Education (0.60)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.60)

A Combinatorial Perspective on the Optimization of Shallow ReLU Networks

Neural Information Processing Systems, May-27-2025, 14:01:55 GMT

The NP-hard problem of optimizing a shallow ReLU network can be characterized as a combinatorial search over each training example's activation pattern followed by a constrained convex problem given a fixed set of activation patterns. We explore the implications of this combinatorial aspect of ReLU optimization in this work. We show that it can be naturally modeled via a geometric and combinatoric object known as a zonotope with its vertex set isomorphic to the set of feasible activation patterns. This assists in analysis and provides a foundation for further research. We demonstrate its usefulness when we explore the sensitivity of the optimal loss to perturbations of the training data.

activation pattern, combinatorial perspective, shallow relu network, (2 more...)

Neural Information Processing Systems

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.63)
Information Technology > Artificial Intelligence > Machine Learning > Computational Learning Theory (0.43)

Review for NeurIPS paper: A Combinatorial Perspective on Transfer Learning

Neural Information Processing Systems, Jan-21-2025, 09:48:18 GMT

Additional Feedback: My major concern is that the authors have only applied their method to variants of MNIST. While the experiments are indeed drawn from the continual learning benchmarks established in prior work, they do not suffice to showcase the true complexity of the continual learning challenge. I would strongly recommend doing at least some RL experiments, for instance as performed in the Online EWC paper. Secondly, as mentioned above, the descriptions of GGM, FMN and NCTL are quite terse and need to be re-read a couple of times to make sense of them. I'd recommend simplifying these descriptions for an easier flow and deferring the details to an appendix.

combinatorial perspective, neurips paper, transfer learning, (4 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Transfer Learning (0.40)

Review for NeurIPS paper: A Combinatorial Perspective on Transfer Learning

Neural Information Processing Systems, Jan-21-2025, 09:48:10 GMT

This paper studies continual learning that does not require task boundary or task identity information and proposes a novel model ensemble method for this problem from a combinatorial perspective. All reviewers and the AC agree that this paper opens a novel and promising direction. The authors also design a delicate algorithm by introducing non-stationary learning techniques to solve this problem. The experimental results are somewhat weak in several aspects, but given the inherent difficulty of online continual learning, they are fairly convincing in justifying the main ideas and proposed methods. Note that after the rebuttal and discussion phases, several major concerns remain: first, the empirical evaluation is not realistic in terms of task diversity and scalability.

combinatorial perspective, neurips paper, transfer learning, (2 more...)

Neural Information Processing Systems

Genre: Research Report (0.64)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Transfer Learning (0.40)

A Combinatorial Perspective on the Optimization of Shallow ReLU Networks

Neural Information Processing Systems, Jan-17-2025, 15:42:43 GMT

The NP-hard problem of optimizing a shallow ReLU network can be characterized as a combinatorial search over each training example's activation pattern followed by a constrained convex problem given a fixed set of activation patterns. We explore the implications of this combinatorial aspect of ReLU optimization in this work. We show that it can be naturally modeled via a geometric and combinatoric object known as a zonotope with its vertex set isomorphic to the set of feasible activation patterns. This assists in analysis and provides a foundation for further research. We demonstrate its usefulness when we explore the sensitivity of the optimal loss to perturbations of the training data.

activation pattern, combinatorial perspective, shallow relu network, (2 more...)

Neural Information Processing Systems

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.63)
Information Technology > Artificial Intelligence > Machine Learning > Computational Learning Theory (0.43)

A Combinatorial Perspective on Transfer Learning

Neural Information Processing Systems, Oct-9-2024, 11:53:19 GMT

Human intelligence is characterized not only by the capacity to learn complex skills, but also by the ability to rapidly adapt and acquire new skills within an ever-changing environment. In this work we study how the learning of modular solutions can allow for effective generalization to both unseen and potentially differently distributed data. Our main postulate is that the combination of task segmentation, modular learning and memory-based ensembling can give rise to generalization on an exponentially growing number of unseen tasks. We provide a concrete instantiation of this idea using a combination of: (1) the Forget-Me-Not Process, for task segmentation and memory-based ensembling; and (2) Gated Linear Networks, which in contrast to contemporary deep learning techniques use a modular and local learning mechanism. We demonstrate that this system exhibits a number of desirable continual learning properties: robustness to catastrophic forgetting, no negative transfer and increasing levels of positive transfer as more tasks are seen.

combinatorial perspective, task segmentation, transfer learning, (1 more...)

Neural Information Processing Systems

Industry: Education (1.00)

Technology: